Clusters, Concepts, and Pseudometrics

Authors

  • Michael D. Rice
  • Michael Siff
Abstract

The fields of cluster analysis and concept analysis are both used to identify patterns in data. Concept analysis identifies similarities between sets of objects based on their attributes. Cluster analysis groups objects with related characteristics based on some notion of distance. In this paper, we investigate connections between these two approaches. In particular, for each binary relation defined on a set of objects O and attributes A, we define distance functions ρ (on the power set of O) and δ (on O). We prove that ρ and δ are pseudometrics and use them to

  • specify a clustering algorithm that computes a subset of the concept lattice
  • discover new interpretations of basic notions in concept analysis.

In particular, we characterize concepts in terms of ρ and characterize a family of concept lattices based on all subsets with a fixed cardinality bound in terms of δ. Our clustering algorithm differs from the classical algorithms since, first, the values of ρ, not δ, determine which pairs of sets are combined at each level, and second, the clusters defined at each level in the algorithm are generally anti-chains and may not be partitions. Therefore, the analysis of the algorithm depends on the metric-geometry of ρ and is more involved than the analysis of the classical algorithms. We have developed a software environment that permits the execution of the algorithm on finite relations and the storage and analysis of the resulting clusters. The algorithm has been run on relations generated from a variety of sources ranging from medical research to sporting events. Our results indicate that the number of c ...

©2002 Published by Elsevier Science B. V.
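The abstract does not give the definitions of ρ and δ, so the following is only a minimal sketch under stated assumptions. It assumes a hypothetical toy relation and takes ρ(X, Y) to be the number of attributes shared by all of X or by all of Y but not by both, with δ(x, y) = ρ({x}, {y}). This is one distance of the kind described, not necessarily the authors' definition. The sketch checks the pseudometric axioms by brute force and tests whether a set of objects is the extent of a concept using the standard closure from concept analysis.

```python
from itertools import combinations

# Hypothetical toy relation (an assumption): object -> attributes it has.
RELATION = {
    "frog": {"aquatic", "legs"},
    "fish": {"aquatic", "fins"},
    "dog": {"legs", "fur"},
    "cat": {"legs", "fur"},
}
ALL_ATTRIBUTES = frozenset().union(*RELATION.values())


def common_attributes(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(ALL_ATTRIBUTES)
    for obj in objects:
        shared &= RELATION[obj]
    return frozenset(shared)


def common_objects(attributes):
    """Objects that have every attribute in `attributes`."""
    return frozenset(o for o, attrs in RELATION.items() if attributes <= attrs)


def rho(xs, ys):
    """Assumed distance on sets of objects: size of the symmetric
    difference of their shared-attribute sets."""
    return len(common_attributes(xs) ^ common_attributes(ys))


def delta(x, y):
    """Assumed distance on single objects, induced by rho on singletons."""
    return rho({x}, {y})


def subsets(items):
    """All subsets of `items`, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_pseudometric(dist, points):
    """Brute-force check: d(a,a)=0, non-negativity, symmetry, triangle inequality."""
    for a in points:
        if dist(a, a) != 0:
            return False
        for b in points:
            if dist(a, b) < 0 or dist(a, b) != dist(b, a):
                return False
            for c in points:
                if dist(a, c) > dist(a, b) + dist(b, c):
                    return False
    return True


def is_concept_extent(xs):
    """True if `xs` equals the set of all objects sharing its common attributes."""
    return common_objects(common_attributes(xs)) == frozenset(xs)


if __name__ == "__main__":
    print(is_pseudometric(rho, subsets(RELATION)))
    print(delta("dog", "cat"), delta("frog", "fish"))
    print(is_concept_extent({"dog", "cat"}), is_concept_extent({"dog"}))
```

In this toy relation, "dog" and "cat" have identical attribute sets, so their assumed distance is zero even though they are distinct objects. That is the sense in which such a function is a pseudometric rather than a metric.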


Similar resources

Kobayashi–Royden vs. Hahn Pseudometric in C^2

We give a characterization of all cartesian products D1 × D2 ⊂ C^2 for which the Kobayashi–Royden and Hahn pseudometrics coincide. In particular, we show that there exist domains in C^2 for which the Kobayashi–Royden and Hahn pseudometrics are different.


Pseudometrics for State Aggregation in Average Reward Markov Decision Processes

We consider how state similarity in average reward Markov decision processes (MDPs) may be described by pseudometrics. Introducing the notion of adequate pseudometrics which are well adapted to the structure of the MDP, we show how these may be used for state aggregation. Upper bounds on the loss that may be caused by working on the aggregated instead of the original MDP are given and compared ...


Pseudometrics, distances and multivariate polynomial inequalities

We discuss three natural pseudodistances and pseudometrics on a bounded domain in ℝ^n based on polynomial inequalities.


Geometric and Electronic Structures of Vanadium Sub-nano Clusters, Vn (n = 2-5), and their Adsorption Complexes with CO and O2 Ligands: A DFT-NBO Study

In this study, the electronic structures of the ground states of pure vanadium sub-nano clusters, Vn (n=2-5), and their interactions with small ligands such as CO and triplet O2 molecules are investigated using density functional theory (DFT) calculations at the mPWPW91/QZVP level of theory. The favorable orientations of these ligands in interaction with pure vanadium sub-nano clusters were determ...


Robust similarity between hypergraphs based on valuations and mathematical morphology operators

This article aims to connect the concepts of similarity, hypergraphs and mathematical morphology. We introduce new measures of similarity and study their relations with pseudometrics defined on lattices. More precisely, based on various lattices that can be defined on hypergraphs, we propose some similarity measures between hypergraphs based on valuations and mathematical morphology operators. We...



Journal title:
  • Electr. Notes Theor. Comput. Sci.

Volume 40

Publication date 2000